Class center-based firefly algorithm for handling missing data
Abstract
A significant advancement that occurs during the data cleaning stage is estimating missing data. Studies have shown that improper handling of missing data leads to inaccurate analysis. Furthermore, most studies indicate the occurrence of missing data irrespective of the correlation between attributes. However, an adaptive search procedure helps determine the estimates of missing data when the correlations between attributes are considered in the process. The Firefly Algorithm (FA) implements an adaptive search procedure in imputation by determining the estimated value closest to the others' value. Therefore, this study proposes a class center-based adaptive approach model for retrieving missing data by considering the attribute correlation in the imputation process (C3-FA). The results showed that the class center-based firefly algorithm is an efficient technique for obtaining the actual value in handling missing data, with a Pearson correlation coefficient (r) and root mean squared error (RMSE) close to 1 and 0, respectively. In addition, the proposed method has the ability to maintain the true distribution of data values. This is indicated by the Kolmogorov–Smirnov test, which stated that the value of D_KS for most attributes in the dataset is generally closer to 0. Furthermore, the accuracy evaluation results using three classifiers show that the proposed method produces good accuracy.
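The three evaluation measures named in the abstract (Pearson r, RMSE, and the Kolmogorov–Smirnov statistic D_KS) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy values are assumptions made for illustration, and the two-sample form of the KS statistic is assumed here because the abstract does not say which variant was used.

```python
import math
from bisect import bisect_right


def pearson_r(actual, imputed):
    """Pearson correlation between actual and imputed values (near 1 is good)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_i = sum(imputed) / n
    cov = sum((a - mean_a) * (i - mean_i) for a, i in zip(actual, imputed))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_i = sum((i - mean_i) ** 2 for i in imputed)
    return cov / math.sqrt(var_a * var_i)


def rmse(actual, imputed):
    """Root mean squared error between actual and imputed values (near 0 is good)."""
    squared = sum((a - i) ** 2 for a, i in zip(actual, imputed))
    return math.sqrt(squared / len(actual))


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    sa, sb = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # is reached at one of the observed data points.
    for x in sa + sb:
        cdf_a = bisect_right(sa, x) / len(sa)
        cdf_b = bisect_right(sb, x) / len(sb)
        d = max(d, abs(cdf_a - cdf_b))
    return d


if __name__ == "__main__":
    # Hypothetical toy values, not data from the paper.
    actual = [2.0, 4.0, 6.0, 8.0]
    imputed = [2.1, 3.9, 6.2, 7.8]
    print("r    =", pearson_r(actual, imputed))
    print("RMSE =", rmse(actual, imputed))
    print("D_KS =", ks_statistic(actual, imputed))
```

Under this reading, an imputation method performs well when r approaches 1, RMSE approaches 0, and D_KS approaches 0, meaning the imputed values track the actual ones and preserve their distribution.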
Similar resources
Dynamic Replication based on Firefly Algorithm in Data Grid
In a data grid, reservation is an accepted way to provide scheduling and service quality. Users need access to the stored data in a geographical environment, which can be solved by using replication, an action taken to reach certainty. As a result, users are directed toward the nearest version to access information. The most important point is to know in which sites and distributed sy...
Family-Based Association Tests with longitudinal measurements: handling missing data.
Several family-based approaches have been previously proposed to enhance the power for testing genetic association when the traits are measured longitudinally or repeatedly. In this paper, we show that some of these FBAT approaches can be easily extended to accommodate incomplete data and remain unbiased tests. We also show that because of the nature of FBAT approaches, we can impute the missin...
Missing Data Handling in Multi-Layer Perceptron
The multi-layer perceptron (MLP) with the back-propagation algorithm is popular and more widely used than other neural network types in various fields of investigation as a non-linear predictor. Though an MLP can solve complex and non-linear problems, it cannot use missing data for training directly. We propose a training algorithm with incomplete pattern data using a conventional MLP network. Focusing on the fact that ...
Handling Missing Values in Data Mining
Missing values and the problems they cause are very common in the data cleaning process. Several methods have been proposed to process missing data in datasets and avoid the problems it causes. This paper discusses various problems caused by missing values and different ways in which one can deal with them. Missing data is a familiar and unavoidable problem in large datasets and is widely discussed i...
Handling Missing Data by Maximum Likelihood
Multiple imputation is rapidly becoming a popular method for handling missing data, especially with easy-to-use software like PROC MI. In this paper, however, I argue that maximum likelihood is usually better than multiple imputation for several important reasons. I then demonstrate how maximum likelihood for missing data can readily be implemented with the following SAS procedures: MI, MIXED, ...
Journal
Journal title: Journal of Big Data
Year: 2021
ISSN: 2196-1115
DOI: https://doi.org/10.1186/s40537-021-00424-y